A flexible rank-based framework for detecting copy number aberrations from array data
Abstract
MOTIVATION: DNA copy number aberration, both inherited and sporadic, is a significant contributor to a variety of human diseases. Copy number characterization is therefore an area of intense research. Probe hybridization-based arrays are important tools for measuring copy number in a high-throughput manner.
RESULTS: In this article, we present a simple but powerful nonparametric rank-based approach to detect deletions and gains from raw array copy number measurements. We use three different rank-based statistics to detect three separate molecular phenomena: somatic lesions, germline deletions and germline gains. The approach is robust and rigorously grounded in statistical theory, thereby enabling the meaningful assignment of statistical significance to each putative aberration. We demonstrate the flexibility of our approach by applying it to data from three different array platforms. We show that our method compares favorably with established approaches by applying it to published, well-characterized samples. Power simulations demonstrate exquisite sensitivity for array data of reasonable quality.
CONCLUSIONS: Our flexible rank-based framework is suitable for multiple platforms, including single nucleotide polymorphism arrays and array comparative genomic hybridization, and can reliably detect gains or losses of genomic DNA, whether inherited, de novo, or somatic.
AVAILABILITY: An R package, RankCopy, containing the methods described here is freely available from the author's web site (http://mendel.gene.cwru.edu/laframboiselab/).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
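The abstract does not specify the three rank-based statistics, so the following is only a rough illustration of the general idea of a nonparametric rank test on array intensities. It is not the RankCopy implementation. As an assumption, it uses a Wilcoxon rank-sum statistic with a normal approximation (no tie correction) to compare a window of probes against the rest of the array; the function names and the example intensities are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_z(intensities, start, end):
    """Z-score of the rank sum for probes in [start, end) versus all others.

    Assumption: normal approximation without tie correction.
    A strongly negative z suggests a loss, a strongly positive z a gain.
    """
    n = len(intensities)
    m = end - start
    if m <= 0 or m >= n:
        raise ValueError("window must be a non-empty proper subset")
    ranks = average_ranks(intensities)
    w = sum(ranks[start:end])
    mean = m * (n + 1) / 2.0
    var = m * (n - m) * (n + 1) / 12.0
    return (w - mean) / math.sqrt(var)

def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical log-ratio intensities with a low-intensity run at indices 4..6.
example = [0.0, 0.1, -0.1, 0.05, -2.0, -2.1, -1.9, 0.02, -0.05, 0.08]
z = rank_sum_z(example, 4, 7)
p = two_sided_p(z)
```

Because only ranks enter the statistic, the result does not depend on the scale or distribution of the raw intensities, which is the property that makes rank-based tests attractive across different array platforms.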
Related articles
Multi-factor data normalization enables the detection of copy number aberrations in amplicon sequencing data
MOTIVATION Because of its low cost, amplicon sequencing, also known as ultra-deep targeted sequencing, is now becoming widely used in oncology for detection of actionable mutations, i.e. mutations influencing cell sensitivity to targeted therapies. Amplicon sequencing is based on the polymerase chain reaction amplification of the regions of interest, a process that considerably distorts the inf...
Quantification of Multiple Tumor Clones Using Gene Array and Sequencing Data.
Cancer development is driven by genomic alterations, including copy number aberrations. The detection of copy number aberrations in tumor cells is often complicated by possible contamination of normal stromal cells in tumor samples and intratumor heterogeneity, namely the presence of multiple clones of tumor cells. In order to correctly quantify copy number aberrations, it is critical to succes...
Detecting copy number variants and runs of homozygosity on a single array: challenges and applications
In constitutional genetics research, analysis of single nucleotide polymorphisms (SNPs) provides invaluable insight into a number of conditions. When analysed in conjunction with copy number variation (CNV) data from array comparative genomic hybridisation (aCGH) arrays, this insight can aid in the identification of additional genetic variants to those yielded by the CNV data alone. Protocols f...
Detecting Copy Number Variations from Array CGH Data Based on a Conditional Random Field Model
Array comparative genomic hybridization (aCGH) allows identification of copy number alterations across genomes. The key computational challenge in analyzing copy number variations (CNVs) using aCGH data or other similar data generated by a variety of array technologies is the detection of segment boundaries of copy number changes and inference of the copy number state for each segment. We have ...
A Bayesian Analysis for Identifying DNA Copy Number Variations Using a Compound Poisson Process
To study chromosomal aberrations that may lead to cancer formation or genetic diseases, the array-based Comparative Genomic Hybridization (aCGH) technique is often used for detecting DNA copy number variants (CNVs). Various methods have been developed for gaining CNVs information based on aCGH data. However, most of these methods make use of the log-intensity ratios in aCGH data without taking ...
Journal title:
- Bioinformatics
Volume 25, Issue 6
Pages: -
Publication date: 2009